The Limiting Distribution for the Number of Symbol Comparisons Used by Quicksort Is Nondegenerate

Authors

  • PATRICK BINDJEME
  • JAMES ALLEN FILL
Abstract

In a continuous-time setting, Fill [2] proved, for a large class of probabilistic sources, that the number of symbol comparisons used by QuickSort, when centered by subtracting the mean and scaled by dividing by time, has a limiting distribution, but proved little about that limiting random variable $Y$, not even that it is nondegenerate. We establish the nondegeneracy of $Y$. The proof is perhaps surprisingly difficult.

1. The number of symbol comparisons used by QuickSort: Brief review of a limiting-distribution result

In this section we briefly review the main theorem of [2]. An infinite sequence of independent and identically distributed keys is generated; each key is a random word $(w_1, w_2, \ldots) = w_1 w_2 \cdots$, that is, an infinite sequence, or “string”, of symbols $w_i$ drawn from a totally ordered finite alphabet $\Sigma$. The common distribution $\mu$ of the keys (called a probabilistic source) is allowed to be any distribution over words, i.e., the distribution of any stochastic process with time parameter set $\{1, 2, \ldots\}$ and state space $\Sigma$. We know thanks to Kolmogorov’s consistency criterion (e.g., Theorem 3.3.6 in [1]) that the possible distributions $\mu$ are in one-to-one correspondence with consistent specifications of finite-dimensional marginals, i.e., of the fundamental probabilities

(1.1) $p_w := \mu(\{w_1 w_2 \cdots w_k\} \times \Sigma^{\infty})$ with $w = w_1 w_2 \cdots w_k \in \Sigma^{*}$.

This $p_w$ is the probability that a word drawn from $\mu$ has $w$ as its length-$k$ prefix.

For each $n$, Hoare’s [6] QuickSort algorithm can be used to sort the first $n$ keys to be generated. We may and do assume that the first key in the sequence is chosen as the pivot, and that the same is true recursively (in the sense, for example, that the pivot used to sort the keys smaller than the original pivot is the first key to be generated that is smaller than the original pivot). A comparison of two keys is done by scanning the two words from left to right, comparing the symbols of matching index one by one until a difference is found.
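The cost model just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from [2]: the source is assumed to be memoryless over the two-symbol alphabet {a, b} (the text allows far more general sources), and the names `LazyWord`, `compare`, `quicksort_symbol_cost`, and `random_keys` are hypothetical helpers introduced here.

```python
import random


class LazyWord:
    """An infinite word whose symbols are drawn on demand.

    Assumption: a memoryless source (i.i.d. symbols); the paper allows
    any distribution over infinite words.
    """

    def __init__(self, seed, alphabet="ab", weights=(0.5, 0.5)):
        self._rng = random.Random(seed)
        self._alphabet = alphabet
        self._weights = weights
        self._symbols = []

    def __getitem__(self, i):
        # Extend the stored prefix until position i exists.
        while len(self._symbols) <= i:
            self._symbols.append(
                self._rng.choices(self._alphabet, self._weights)[0]
            )
        return self._symbols[i]


def compare(u, v):
    """Scan u and v left to right until the first differing symbol.

    Returns (sign, cost): sign is -1 if u < v and +1 if u > v; cost is the
    number of symbol comparisons made. The two words must differ somewhere.
    """
    i = 0
    while True:
        a, b = u[i], v[i]
        i += 1
        if a != b:
            return (-1 if a < b else 1), i


def quicksort_symbol_cost(keys):
    """QuickSort with the first key as pivot, recursively.

    Returns (sorted_keys, total_symbol_comparisons). Sublists keep the
    order of generation, so each recursive pivot is the first key
    generated in its sublist.
    """
    if len(keys) <= 1:
        return list(keys), 0
    pivot, rest = keys[0], keys[1:]
    smaller, larger, cost = [], [], 0
    for key in rest:
        sign, c = compare(key, pivot)
        cost += c
        (smaller if sign < 0 else larger).append(key)
    left, cost_left = quicksort_symbol_cost(smaller)
    right, cost_right = quicksort_symbol_cost(larger)
    return left + [pivot] + right, cost + cost_left + cost_right


def random_keys(n, seed=0):
    """Generate n independent lazy words, each with its own seed."""
    master = random.Random(seed)
    return [LazyWord(master.getrandbits(64)) for _ in range(n)]
```

For example, `quicksort_symbol_cost(["ba", "ab", "bb", "aa"])` uses ordinary finite strings as stand-ins for words and returns the sorted list together with a count of 6 symbol comparisons: 4 against the pivot "ba" and 2 more to split "ab" from "aa". Calling `quicksort_symbol_cost(random_keys(n))` gives one sample of the symbol-comparison count for n randomly generated keys under the assumed source.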
We let $S_n$ denote the total number of symbol comparisons needed when $n$ keys are sorted by QuickSort.

Theorem 1.1 (Fill [2], Theorem 3.1). Consider the continuous-time setting in which keys are generated from a probabilistic source at the arrival times of an independent Poisson process $N$ with unit rate. Let $S(t) = S_{N(t)}$ denote the number of symbol comparisons required by QuickSort to sort the keys generated through epoch $t$, and let

(1.2) $Y(t) := \frac{S(t) - \mathbf{E}\,S(t)}{t}$, $0 < t < \infty$.

Assume that

(1.3) $\sum^{\infty}$ ...

Date: January 27, 2012. Research supported by the Acheson J. Duncan Fund for the Advancement of Research in Statistics.

Similar resources

The limiting distribution for the number of symbol comparisons used by QuickSort is nondegenerate (extended abstract)

In a continuous-time setting, Fill (2012) proved, for a large class of probabilistic sources, that the number of symbol comparisons used by QuickSort, when centered by subtracting the mean and scaled by dividing by time, has a limiting distribution, but proved little about that limiting random variable $Y$, not even that it is nondegenerate. We establish the nondegeneracy of $Y$. The proof is perh...

Distributional convergence for the number of symbol comparisons used by QuickSort

Most previous studies of the sorting algorithm QuickSort have used the number of key comparisons as a measure of the cost of executing the algorithm. Here we suppose that the n independent and identically distributed (iid) keys are each represented as a sequence of symbols from a probabilistic source and that QuickSort operates on individual symbols, and we measure the execution cost as the num...

Approximating the limiting Quicksort distribution

The limiting distribution of the normalized number of comparisons used by Quicksort to sort an array of n numbers is known to be the unique fixed point with zero mean of a certain distributional transformation S. We study the convergence to the limiting distribution of the sequence of distributions obtained by iterating the transformation S, beginning with a (nearly) arbitrary starting distribu...

Smoothness and decay properties of the limiting Quicksort density function

Using Fourier analysis, we prove that the limiting distribution of the standardized random number of comparisons used by Quicksort to sort an array of $n$ numbers has an everywhere positive and infinitely differentiable density $f$, and that each derivative $f^{(k)}$ enjoys superpolynomial decay at $\pm\infty$. In particular, each $f^{(k)}$ is bounded. Our method is sufficiently computational to prove, for example...

Quicksort Algorithm Again Revisited

We consider the standard Quicksort algorithm that sorts n distinct keys with all possible n! orderings of keys being equally likely. Equivalently, we analyze the total path length n in a randomly built binary search tree. Obtaining the limiting distribution of n is still an outstanding open problem. In this paper, we establish an integral equation for the probability density of the number of co...

Publication date: 2012